Group Members: Travis, Ira, Micah

Goal: This dashboard presents a multi-regional study of U.S. weather behavior in 2024.
It combines exploratory weather analysis, geographical mapping, and predictive modeling, including a classification machine-learning model.

Exploring wind data using the Meteostat Python API:

Source: Meteostat Python API

Time Period: 2024

Frame: Hourly and Daily

Key Variables
  • wspd: Average wind speed (mph)
  • wdir: Mean wind direction (degrees)
  • temp: Temperature (°F)
  • coco: Condition code

Midwest & Northeast Wind Analysis
5-Year Hourly Averages (2020–2024)

City                 Speed (km/h)   Direction (°)
Midwest
Cleveland, OH        18.0           226.0
Chicago, IL          15.9           255.0
Detroit, MI          15.8           249.0
Milwaukee, WI        15.4           286.0
Minneapolis, MN      12.6           311.0
Northeast
Buffalo, NY          17.9           243.0
Boston, MA           17.3           276.0
Philadelphia, PA     14.0           294.0
Pittsburgh, PA       11.5           284.0
New York, NY         11.0           294.0

Southeast & West Wind Analysis
5-Year Hourly Averages (2020–2024)

City                 Speed (km/h)   Direction (°)
Southeast
Jacksonville, FL     13.0           16.0
Miami, FL            12.5           75.0
Tampa, FL            10.5           47.0
Charlotte, NC        10.2           326.0
Atlanta, GA          7.7            352.0
West
San Francisco, CA    13.4           295.0
Denver, CO           13.0           192.0
Portland, OR         10.8           330.0
Seattle, WA          8.3            306.0
Los Angeles, CA      6.6            330.0
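
Averaging wind directions needs some care, because a plain arithmetic mean of 350° and 10° gives 180° (south) even though both winds blow from roughly north. The sketch below shows one common approach, a vector (circular) mean with optional speed weighting. It is an illustration only: we do not know whether the averages in the tables above were computed this way, and the function name `mean_wind_direction` is hypothetical.

```python
import math

def mean_wind_direction(directions_deg, speeds=None):
    """Vector-average wind directions given in degrees.

    If speeds is given, each direction is weighted by its wind speed;
    otherwise every observation counts equally.
    """
    if speeds is None:
        speeds = [1.0] * len(directions_deg)
    # Sum the east-west (sin) and north-south (cos) parts of each vector.
    x = sum(s * math.sin(math.radians(d)) for d, s in zip(directions_deg, speeds))
    y = sum(s * math.cos(math.radians(d)) for d, s in zip(directions_deg, speeds))
    # atan2 returns an angle in (-180, 180]; shift it into [0, 360).
    angle = math.degrees(math.atan2(x, y)) % 360.0
    # Floating-point rounding can yield exactly 360.0; report that as 0.0.
    return 0.0 if angle >= 360.0 else angle

naive = (350 + 10) / 2                      # arithmetic mean, points south
circular = mean_wind_direction([350, 10])   # vector mean, points north
```

Here `naive` is 180.0, while `circular` lands within rounding error of north (0° or, equivalently, just under 360°).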


Click points on the map to see the wind speeds for the week surrounding a tornado (Shift+click to select multiple tornadoes to compare).
The categories are based on the Enhanced Fujita (EF) Scale: EF0 (65-85 mph), EF1 (86-110 mph), EF2 (111-135 mph), EF3 (136-165 mph), EF4 (166-200 mph), and EF5 (>200 mph). Our data visualizations do not exactly match what we would expect to see. We suspect the reasons include the following:
  • Tornadoes are categorized by peak wind speed rather than average wind speed.
  • The wind speed shown is measured at the closest weather station, and the distance between stations and tornadoes can vary greatly.
  • Tornado radii can vary greatly.
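
To make the mismatch concrete, the sketch below maps a wind speed in mph onto the EF ranges listed above and converts a station reading from km/h first. The names `ef_category` and `kmh_to_mph` are hypothetical helpers, not part of the dashboard code, and how to treat fractional speeds that fall between two ranges (for example 85.5 mph) is our assumption.

```python
KMH_PER_MPH = 1.609344  # exact definition of the international mile

def kmh_to_mph(speed_kmh):
    """Convert a wind speed from km/h to mph."""
    return speed_kmh / KMH_PER_MPH

# Lower bounds (mph) of EF4 down to EF0, taken from the ranges above.
EF_LOWER_BOUNDS = [("EF4", 166), ("EF3", 136), ("EF2", 111), ("EF1", 86), ("EF0", 65)]

def ef_category(peak_mph):
    """Return the EF label for a peak wind speed, or None below EF0.

    Assumption: a fractional speed between two ranges stays in the lower
    category until it reaches the next lower bound.
    """
    if peak_mph > 200:
        return "EF5"
    for label, lower in EF_LOWER_BOUNDS:
        if peak_mph >= lower:
            return label
    return None

# Cleveland's average hourly speed from the table (18.0 km/h) is about 11.2 mph.
station_average = ef_category(kmh_to_mph(18.0))
```

`station_average` is `None`: an average station reading sits far below even the EF0 threshold, which is consistent with the first suspected reason above.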


The goal of this model is to predict wind vectors (speed and direction) at the following international airports:

  • Pittsburgh International Airport (PIT) – 40.4406, -79.9959
  • Los Angeles International Airport (LAX) – 33.9425, -118.4081
  • Miami International Airport (MIA) – 25.7959, -80.2870
  • Denver International Airport (DEN) – 39.8617, -104.6731
  • Chicago O’Hare International Airport (ORD) – 41.9742, -87.9073
  • Seattle-Tacoma International Airport (SEA) – 47.4502, -122.3088

The features used for prediction are the locations and wind vectors of the 5 closest stations to each airport.
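
One way to pick the closest stations is to rank candidates by great-circle (haversine) distance, as sketched below. This is an assumption about the approach, not the project's actual code: the helper names are hypothetical, and the other airports from the list above stand in for candidate weather stations purely for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_stations(target, stations, k=5):
    """Return the k stations closest to target.

    target is (lat, lon); each station is (name, lat, lon).
    """
    lat, lon = target
    ranked = sorted(stations, key=lambda s: haversine_km(lat, lon, s[1], s[2]))
    return ranked[:k]

# Stand-in candidates: the other airports listed above (not real station data).
PIT = (40.4406, -79.9959)
candidates = [
    ("LAX", 33.9425, -118.4081),
    ("MIA", 25.7959, -80.2870),
    ("DEN", 39.8617, -104.6731),
    ("ORD", 41.9742, -87.9073),
    ("SEA", 47.4502, -122.3088),
]
closest_two = nearest_stations(PIT, candidates, k=2)
```

With real data, `candidates` would be the full station inventory and `k` would be 5, matching the feature description above.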

We recorded the following metrics on the test set:

Metric                               Value
R² (Coefficient of Determination)    0.7408853538672449
RMSE (Root Mean Squared Error)       5.428108009372506 km/h

The \(R^2\) of \(\mathbf{0.74}\) indicates that the model explains approximately 74.1% of the variance in the true wind vectors. The RMSE of \(\mathbf{5.43}\) km/h is the typical magnitude of the prediction error; because errors are squared before averaging, large misses count more heavily than they would in a plain average.
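
For reference, both metrics can be computed from their definitions as in the sketch below. The numbers are toy values chosen for illustration, not dashboard data, and we do not know how the project combined the speed and direction outputs into a single score, so the example scores one output only.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy wind speeds in km/h (illustrative only).
observed = [10.0, 12.0, 8.0, 15.0]
predicted = [11.0, 11.0, 9.0, 13.0]

toy_rmse = rmse(observed, predicted)
toy_r2 = r2_score(observed, predicted)
```

For these toy values, `toy_rmse` is about 1.32 km/h and `toy_r2` is about 0.738.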


Model Objective: This model predicts the weather condition class (Clear, Cloudy, Rain, or Snow) from 172 engineered features.

Condition Code Groupings
  • Clear: Condition codes 1-2 (Fair weather, clear skies)
  • Cloudy: Condition codes 3-6 (Cloudy, overcast, foggy conditions)
  • Rain: Condition codes 7-13, 17-20 (Light to heavy rain, freezing rain, sleet, and rain or sleet showers)
  • Snow: Condition codes 14-16, 21-22 (Light to heavy snowfall and snow showers)
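
The groupings above translate directly into a small lookup, sketched here. The function name `condition_group` is hypothetical, and returning `None` for codes outside the four groups is our choice for the illustration.

```python
def condition_group(coco):
    """Map a condition code to one of the four model classes.

    Returns None for codes that fall outside the four groups.
    """
    if 1 <= coco <= 2:
        return "Clear"
    if 3 <= coco <= 6:
        return "Cloudy"
    if 7 <= coco <= 13 or 17 <= coco <= 20:
        return "Rain"
    if 14 <= coco <= 16 or 21 <= coco <= 22:
        return "Snow"
    return None

# Label a short run of hourly codes and drop anything ungrouped.
hourly_codes = [1, 4, 8, 15, 18, 21, 25]
labels = [g for g in map(condition_group, hourly_codes) if g is not None]
```

In this example the final code (25) falls outside the four groups, so `labels` has six entries.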
Note: Storm conditions (codes 23-27) were excluded due to extreme rarity in the dataset.

Model Construction
  • Algorithm: XGBoost Classifier (selected after comparison with Random Forest and Logistic Regression)
  • City-based split: trained on 40 cities and evaluated on 10 held-out cities the model never saw during training
  • Features: 172 engineered features including:
    • Interaction terms: Temperature-pressure ratios, wind-humidity products, etc.
    • Lag features: 1, 2, and 3-hour time lag on weather data
    • Differential terms: Pressure changes, humidity changes, wind speed variations
    • Meteorological indices: Heat index, wind chill, vapor pressure deficit
    • Temporal features: Seasonal cycles, solar elevation, daylight indicators
  • Manual class weights applied to address rain/snow underperformance:
    • Clear: 1.5x | Cloudy: 3.0x | Rain: 20.0x | Snow: 325.0x
    • This weighting improved Rain and Snow detection rates but slightly reduced overall accuracy
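
The city-based split and the manual class weights can be sketched as follows. This is a minimal illustration under our own assumptions: the row layout, the city names, and the helpers `split_by_city` and `sample_weights` are hypothetical, and the source does not say how the weights were passed to the classifier. One common option is the `sample_weight` argument of `XGBClassifier.fit`.

```python
import random

# Manual class weights from the list above.
CLASS_WEIGHTS = {"Clear": 1.5, "Cloudy": 3.0, "Rain": 20.0, "Snow": 325.0}

def split_by_city(rows, n_test_cities, seed=0):
    """Hold out whole cities so no test city appears in training."""
    cities = sorted({row["city"] for row in rows})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    test_cities = set(rng.sample(cities, n_test_cities))
    train = [row for row in rows if row["city"] not in test_cities]
    test = [row for row in rows if row["city"] in test_cities]
    return train, test

def sample_weights(labels):
    """Give each observation the weight of its class."""
    return [CLASS_WEIGHTS[label] for label in labels]

# Toy rows with hypothetical cities; real rows would carry the 172 features.
rows = [
    {"city": "A", "label": "Clear"},
    {"city": "A", "label": "Rain"},
    {"city": "B", "label": "Cloudy"},
    {"city": "C", "label": "Snow"},
    {"city": "D", "label": "Clear"},
    {"city": "E", "label": "Rain"},
]
train_rows, test_rows = split_by_city(rows, n_test_cities=1)
train_weights = sample_weights([row["label"] for row in train_rows])
```

Splitting by city rather than by row keeps hourly observations from the same city from landing on both sides of the split, so the test score reflects performance on cities the model has not seen.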

1. How do weather patterns change by region?

2. What are some case studies of extreme weather?

3. How do geographical features (lakes, oceans, mountains, deserts, plains) impact weather patterns?